In our final project, we begin with a corpus of images, assemble a classification algorithm for that corpus, and evaluate how well it works. We start with a preconstructed dataset of pictures of food.
reticulate::use_python("/anaconda3/bin/python")
library(readr)
library(ggplot2)
library(dplyr)
library(methods)
library(stringi)
library(keras)
Load the following libraries:
library(readr)
library(dplyr)
library(glmnet)
## Loading required package: Matrix
## Loading required package: foreach
## Loaded glmnet 2.0-16
library(keras)
Load the RDS and CSV files created in the previous Rmd.
X <- read_rds("my-image-embed.rds")
image_data <- read_csv("my-image-data.csv")
## Parsed with column specification:
## cols(
## obs_id = col_character(),
## train_id = col_character(),
## class = col_double(),
## class_name = col_character(),
## path_to_image = col_character()
## )
Now we need to create a training dataset and define our variables. to_categorical, used in the code below, converts the numeric class labels into binary indicator variables, known as a one-hot encoding. Because to_categorical returns an indicator matrix rather than the labels themselves, we also keep the original numeric labels in a y_old variable to help with image classification later.
y_old <- image_data$class
X_train <- X[image_data$train_id == "train",]
y <- to_categorical(image_data$class)
y_train<- to_categorical(image_data$class[image_data$train_id == "train"])
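As a quick illustration of what the one-hot encoding looks like (a toy example, not part of the project data), to_categorical maps each integer label to a row with a single 1 in the column for that class:

```r
library(keras)

# Three labels 0, 1, 2 become a 3 x 3 indicator matrix:
# row i has a 1 in column (label + 1) and 0 elsewhere.
to_categorical(c(0, 1, 2))
```

This is why y_old is worth keeping: recovering the integer label from the one-hot matrix would otherwise require an extra which.max step per row.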
Next we fit a neural network, built with keras, to our existing dataset. We use leaky_relu instead of relu to make sure there are no dead neurons. We also use three hidden layers of 512 units each.
model <- keras_model_sequential()
model %>%
  layer_dense(units = 512, input_shape = ncol(X_train)) %>%
  layer_activation_leaky_relu() %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 512) %>%
  layer_activation_leaky_relu() %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 512) %>%
  layer_activation_leaky_relu() %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = ncol(y_train)) %>%
  layer_activation(activation = "softmax")
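To see why leaky ReLU avoids dead neurons, here is a base-R sketch of the activation itself (alpha = 0.3 is the keras default for layer_activation_leaky_relu): negative inputs keep a small, nonzero slope instead of being zeroed out as with plain ReLU, so their gradients never vanish entirely.

```r
# Leaky ReLU: identity for positive inputs, a small linear
# slope (alpha) for negative inputs.
leaky_relu <- function(x, alpha = 0.3) ifelse(x > 0, x, alpha * x)

leaky_relu(c(-2, 0, 3))  # -0.6  0.0  3.0
```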
model %>% compile(loss = 'categorical_crossentropy',
optimizer = optimizer_rmsprop(lr = 0.0001),
metrics = c('accuracy'))
history <- model %>%
fit(X_train, y_train, epochs = 8)
plot(history)
Next, let's look at a few sample images. The images are spread across multiple folders, so we first need to find the file path. Once found, we can read in a few images as samples. First, we find and display a sample image of chocolate cake.
image_path <- "C:/Users/zachk/Desktop/Stat Learning/food-101/food-101/food-101/images/chocolate_cake/62855.jpg"
image <- image_load(image_path, target_size = c(224,224))
image <- image_to_array(image)
image <- array_reshape(image, c(1, dim(image)))
dim(image)
## [1] 1 224 224 3
par(mar = rep(0, 4L))
plot(0, 0, xlim = c(0, 1), ylim = c(0, 1), axes = FALSE, type = "n", asp = 1)
rasterImage(image[1, , , ] / 255, 0, 0, 1, 1)
After this has been found, we scour our archives for another sample image. Our archive consists entirely of desserts, and we wanted to find a different category of dessert that looks relatively similar. The picture of chocolate mousse below looks as if it could be a chocolate cake, but it is not. This shows the difficult task our algorithm will have to accomplish when distinguishing between these desserts.
image_path <- "C:/Users/zachk/Desktop/Stat Learning/food-101/food-101/food-101/images/chocolate_mousse/153840.jpg"
image <- image_load(image_path, target_size = c(224,224))
image <- image_to_array(image)
image <- array_reshape(image, c(1, dim(image)))
dim(image)
## [1] 1 224 224 3
par(mar = rep(0, 4L))
plot(0, 0, xlim = c(0, 1), ylim = c(0, 1), axes = FALSE, type = "n", asp = 1)
rasterImage(image[1, , , ] / 255, 0, 0, 1, 1)
After we have plotted the history, we move forward to examining how well the model makes predictions. The training set is classified correctly about 79% of the time, but the validation set only about 73% of the time.
y_pred <- predict_classes(model, X)
image_data$y_pred <- y_pred
tapply(image_data$class == y_pred, image_data$train_id, mean)
## train valid
## 0.7925000 0.7273214
Then we can create the confusion matrix to see how distinct the categories of dessert are, as shown below. We can see which desserts get misclassified as one another most frequently and confuse the algorithm. The most common overlaps include cup cakes and red velvet cake, and chocolate cake and carrot cake.
class_names <- levels(factor(image_data$class_name))  # sorted, so positions match the class indices
table(value = class_names[y_old + 1L], prediction = class_names[y_pred + 1L], image_data$train_id)
## , , = train
##
## prediction
## value apple_pie bread_pudding cannoli carrot_cake cheesecake
## apple_pie 539 5 3 4 1
## bread_pudding 9 499 3 12 3
## cannoli 11 4 397 14 2
## carrot_cake 4 15 6 419 1
## cheesecake 1 0 1 2 564
## chocolate_cake 2 2 18 91 3
## chocolate_mousse 0 15 1 7 1
## churros 2 1 14 0 0
## creme_brulee 7 5 2 13 0
## cup_cakes 12 2 29 3 11
## donuts 4 1 7 1 2
## frozen_yogurt 3 3 11 12 33
## ice_cream 5 0 15 18 1
## red_velvet_cake 9 2 22 2 3
## prediction
## value chocolate_cake chocolate_mousse churros creme_brulee
## apple_pie 4 2 2 18
## bread_pudding 4 18 4 21
## cannoli 11 14 44 8
## carrot_cake 42 4 2 33
## cheesecake 3 4 0 1
## chocolate_cake 380 0 4 20
## chocolate_mousse 2 527 18 1
## churros 1 21 516 1
## creme_brulee 3 2 3 535
## cup_cakes 9 11 12 10
## donuts 7 1 0 2
## frozen_yogurt 13 13 8 17
## ice_cream 9 7 25 18
## red_velvet_cake 5 7 25 13
## prediction
## value cup_cakes donuts frozen_yogurt ice_cream
## apple_pie 6 3 2 4
## bread_pudding 3 6 6 4
## cannoli 19 8 3 43
## carrot_cake 5 5 14 41
## cheesecake 3 1 17 2
## chocolate_cake 11 14 11 31
## chocolate_mousse 5 2 7 5
## churros 6 0 4 18
## creme_brulee 4 3 7 10
## cup_cakes 352 21 13 12
## donuts 9 554 4 6
## frozen_yogurt 36 1 439 4
## ice_cream 5 2 5 475
## red_velvet_cake 26 4 2 19
## prediction
## value red_velvet_cake
## apple_pie 7
## bread_pudding 8
## cannoli 22
## carrot_cake 9
## cheesecake 1
## chocolate_cake 13
## chocolate_mousse 9
## churros 16
## creme_brulee 6
## cup_cakes 103
## donuts 2
## frozen_yogurt 7
## ice_cream 15
## red_velvet_cake 461
##
## , , = valid
##
## prediction
## value apple_pie bread_pudding cannoli carrot_cake cheesecake
## apple_pie 349 4 1 7 2
## bread_pudding 10 318 2 10 3
## cannoli 7 3 225 11 1
## carrot_cake 5 6 4 250 0
## cheesecake 2 0 2 2 360
## chocolate_cake 1 3 21 73 11
## chocolate_mousse 1 12 2 7 1
## churros 2 1 15 0 0
## creme_brulee 3 3 4 12 2
## cup_cakes 6 5 24 4 7
## donuts 7 5 4 12 2
## frozen_yogurt 3 5 10 13 31
## ice_cream 4 1 9 19 1
## red_velvet_cake 6 9 22 0 3
## prediction
## value chocolate_cake chocolate_mousse churros creme_brulee
## apple_pie 3 1 2 8
## bread_pudding 2 14 2 17
## cannoli 6 8 33 14
## carrot_cake 43 0 0 37
## cheesecake 3 7 1 1
## chocolate_cake 204 2 4 16
## chocolate_mousse 2 325 17 5
## churros 0 14 329 4
## creme_brulee 5 2 2 343
## cup_cakes 14 7 10 13
## donuts 7 1 0 1
## frozen_yogurt 7 9 3 4
## ice_cream 7 11 27 10
## red_velvet_cake 3 6 29 11
## prediction
## value cup_cakes donuts frozen_yogurt ice_cream
## apple_pie 6 6 3 4
## bread_pudding 4 3 8 1
## cannoli 10 18 4 44
## carrot_cake 2 4 9 36
## cheesecake 6 1 11 3
## chocolate_cake 13 12 7 25
## chocolate_mousse 2 1 7 5
## churros 6 2 3 16
## creme_brulee 4 2 9 6
## cup_cakes 202 10 23 12
## donuts 13 342 0 4
## frozen_yogurt 34 3 268 2
## ice_cream 3 1 4 286
## red_velvet_cake 14 1 10 14
## prediction
## value red_velvet_cake
## apple_pie 4
## bread_pudding 6
## cannoli 16
## carrot_cake 4
## cheesecake 1
## chocolate_cake 8
## chocolate_mousse 13
## churros 8
## creme_brulee 3
## cup_cakes 63
## donuts 2
## frozen_yogurt 8
## ice_cream 17
## red_velvet_cake 272
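A possible follow-up on the confusion matrix (a sketch, reusing the y_pred and y_old vectors defined above): per-class accuracy on the validation split makes it easy to rank which desserts the network finds hardest, without reading the full table.

```r
# Restrict to the validation split, then average correctness
# within each class name; sort from hardest to easiest.
valid <- image_data$train_id == "valid"
sort(tapply((y_pred == y_old)[valid], image_data$class_name[valid], mean))
```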
Finally, we recover the class names from the dataset and show some negative examples. These are images that our model puts in the wrong category. For instance, as you can see, the model thinks the bread pudding below is a cannoli, probably because of its similar shape and color.
class_names <- levels(factor(image_data$class_name))
par(mfrow = c(2, 3))
id <- sample(which(y_pred != y_old), 6)  # six misclassified images to fill the 2 x 3 grid
for (i in id) {
  par(mar = rep(0, 4L))
  plot(0, 0, xlim = c(0, 1), ylim = c(0, 1), axes = FALSE, type = "n")
  Z <- image_to_array(image_load(image_data$path_to_image[i], target_size = c(224, 224)))
  rasterImage(Z / 255, 0, 0, 1, 1)
  text(0.5, 0.1, label = class_names[y_pred[i] + 1L], col = "red", cex = 2)   # predicted class
  text(0.5, 0.2, label = class_names[y_old[i] + 1L], col = "green", cex = 2)  # true class
}